{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "1db53cc2-af26-403b-8500-8c1b937c1076",
   "metadata": {},
   "source": [
    "# Labelme转mask-批量\n",
    "\n",
    "同济子豪兄 2023-6-8"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "661b866c-2151-4d57-874b-28a4af09b6c4",
   "metadata": {},
   "source": [
    "## 下载样例数据集"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "8a15147e-de0f-4a9f-ba70-b0edac5e2fb5",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--2023-06-09 09:09:27--  https://zihao-openmmlab.obs.cn-east-3.myhuaweicloud.com/20230130-mmseg/dataset/watermelon/Watermelon87_Semantic_Seg_Labelme.zip\n",
      "正在连接 172.16.0.13:5848... 已连接。\n",
      "已发出 Proxy 请求，正在等待回应... 200 OK\n",
      "长度： 12978831 (12M) [application/zip]\n",
      "正在保存至: “Watermelon87_Semantic_Seg_Labelme.zip”\n",
      "\n",
      "Watermelon87_Semant 100%[===================>]  12.38M  26.4MB/s    用时 0.5s    \n",
      "\n",
      "2023-06-09 09:09:28 (26.4 MB/s) - 已保存 “Watermelon87_Semantic_Seg_Labelme.zip” [12978831/12978831])\n",
      "\n"
     ]
    }
   ],
   "source": [
    "!rm -rf Watermelon87_Semantic_Seg_Labelme.zip Watermelon87_Semantic_Seg_Labelme # 删除原有压缩包和文件夹\n",
    "!wget https://zihao-openmmlab.obs.cn-east-3.myhuaweicloud.com/20230130-mmseg/dataset/watermelon/Watermelon87_Semantic_Seg_Labelme.zip # 下载压缩包\n",
    "!unzip Watermelon87_Semantic_Seg_Labelme.zip >> /dev/null # 解压压缩包\n",
    "!rm -rf Watermelon87_Semantic_Seg_Labelme.zip # 删除压缩包"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3740a3d5-5bad-44c5-9fc6-f515e3c5596f",
   "metadata": {},
   "source": [
    "## 查看数据集目录结构"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "25f20f9a-12c1-417d-a5ff-f1842ba8e6c0",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "📁 Watermelon87_Semantic_Seg_Labelme/\n",
      "├─📁 images/\n",
      "└─📁 labelme_jsons/\n"
     ]
    }
   ],
   "source": [
    "import seedir as sd\n",
    "sd.seedir('Watermelon87_Semantic_Seg_Labelme', style='emoji', depthlimit=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bbd4af86-7fa5-4b04-ba91-681b19e7ca4f",
   "metadata": {},
   "source": [
    "## 删除系统自动生成的多余文件"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13f5c084-4e77-4bb6-9df2-20b67e13233f",
   "metadata": {},
   "source": [
    "### 查看待删除的多余文件"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "ad569d4f-6f41-47b1-bdf1-51c9b1f3f22f",
   "metadata": {},
   "outputs": [],
   "source": [
    "!find . -iname '__MACOSX'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "69740a5b-5eb7-4807-adab-605e12723caf",
   "metadata": {},
   "outputs": [],
   "source": [
    "!find . -iname '.DS_Store'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "c4471e19-e056-4808-83b3-4131bd9f4a67",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "./.ipynb_checkpoints\n"
     ]
    }
   ],
   "source": [
    "!find . -iname '.ipynb_checkpoints'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "002ab232-1050-4234-892b-213c1d5aab86",
   "metadata": {},
   "source": [
    "### 删除多余文件"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "1d15e453-4997-4f2d-9ddb-b065d3b205de",
   "metadata": {},
   "outputs": [],
   "source": [
    "!for i in `find . -iname '__MACOSX'`; do rm -rf $i;done"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "5076f7be-5d16-473e-b3cc-4943ad39fab8",
   "metadata": {},
   "outputs": [],
   "source": [
    "!for i in `find . -iname '.DS_Store'`; do rm -rf $i;done"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "bf534009-e7ad-42e6-9bbc-207407291768",
   "metadata": {},
   "outputs": [],
   "source": [
    "!for i in `find . -iname '.ipynb_checkpoints'`; do rm -rf $i;done"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2288f722-a710-43fa-8145-559ac9ed424f",
   "metadata": {},
   "source": [
    "### 验证多余文件已删除"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "de8b2643-dbc4-44fb-a123-d6b87da87815",
   "metadata": {},
   "outputs": [],
   "source": [
    "!find . -iname '__MACOSX'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "5d33dddc-62e7-4396-b77f-c932a4bf22ae",
   "metadata": {},
   "outputs": [],
   "source": [
    "!find . -iname '.DS_Store'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "c3aa1a06-bbb3-4a76-af49-1ce36dbf86e5",
   "metadata": {},
   "outputs": [],
   "source": [
    "!find . -iname '.ipynb_checkpoints'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4ff99cf7-4d42-4054-a7a8-a433f4e246f1",
   "metadata": {},
   "source": [
    "## 进入数据集目录"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "0760c2f6-b0f5-4111-b70a-921e30ca3957",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import json\n",
    "import numpy as np\n",
    "import cv2\n",
    "import shutil\n",
    "\n",
    "from tqdm import tqdm"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6254a685-2d79-4406-a103-4e3729212b66",
   "metadata": {},
   "source": [
    "## 数据集及类别信息"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "fe93bf2f-3ac7-4570-a0ae-73703fa83221",
   "metadata": {},
   "outputs": [],
   "source": [
    "Dataset_Path = 'Watermelon87_Semantic_Seg_Labelme'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df36586d-0c8d-4041-877e-05ec3bde5c9b",
   "metadata": {},
   "source": [
    "## 每个类别的信息及画mask的顺序（按照由大到小，由粗到精的顺序）"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "ed22f2fc-e0d4-45c8-8184-4cf8007901ea",
   "metadata": {},
   "outputs": [],
   "source": [
    "# 0-背景，从 1 开始\n",
    "class_info = [\n",
    "    {'label':'red', 'type':'polygon', 'color':1},                    # polygon 多段线\n",
    "    {'label':'green', 'type':'polygon', 'color':2},\n",
    "    {'label':'white', 'type':'polygon', 'color':3},\n",
    "    {'label':'seed-black','type':'polygon','color':4},\n",
    "    {'label':'seed-white','type':'polygon','color':5}\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8c9ff801-0e6d-4ae7-84d1-e501ba73d9a8",
   "metadata": {},
   "source": [
    "## 单张图像labelme转mask函数"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "cf1f8fc4-3acf-4dd3-b159-29d29ce4d7d5",
   "metadata": {},
   "outputs": [],
   "source": [
    "def labelme2mask_single_img(img_path, labelme_json_path):\n",
    "    '''\n",
    "    输入原始图像路径和labelme标注路径，输出 mask\n",
    "    '''\n",
    "    \n",
    "    img_bgr = cv2.imread(img_path)\n",
    "    img_mask = np.zeros(img_bgr.shape[:2]) # 创建空白图像 0-背景\n",
    "    \n",
    "    with open(labelme_json_path, 'r', encoding='utf-8') as f:\n",
    "        labelme = json.load(f)\n",
    "        \n",
    "    for one_class in class_info: # 按顺序遍历每一个类别\n",
    "        for each in labelme['shapes']: # 遍历所有标注，找到属于当前类别的标注\n",
    "            if each['label'] == one_class['label']:\n",
    "                if one_class['type'] == 'polygon': # polygon 多段线标注\n",
    "\n",
    "                    # 获取点的坐标\n",
    "                    points = [np.array(each['points'], dtype=np.int32).reshape((-1, 1, 2))]\n",
    "\n",
    "                    # 在空白图上画 mask（闭合区域）\n",
    "                    img_mask = cv2.fillPoly(img_mask, points, color=one_class['color'])\n",
    "\n",
    "                elif one_class['type'] == 'line' or one_class['type'] == 'linestrip': # line 或者 linestrip 线段标注\n",
    "\n",
    "                    # 获取点的坐标\n",
    "                    points = [np.array(each['points'], dtype=np.int32).reshape((-1, 1, 2))]\n",
    "\n",
    "                    # 在空白图上画 mask（非闭合区域）\n",
    "                    img_mask = cv2.polylines(img_mask, points, isClosed=False, color=one_class['color'], thickness=one_class['thickness']) \n",
    "\n",
    "                elif one_class['type'] == 'circle': # circle 圆形标注\n",
    "\n",
    "                    points = np.array(each['points'], dtype=np.int32)\n",
    "\n",
    "                    center_x, center_y = points[0][0], points[0][1] # 圆心点坐标\n",
    "\n",
    "                    edge_x, edge_y = points[1][0], points[1][1]     # 圆周点坐标\n",
    "\n",
    "                    radius = np.linalg.norm(np.array([center_x, center_y] - np.array([edge_x, edge_y]))).astype('uint32') # 半径\n",
    "\n",
    "                    img_mask = cv2.circle(img_mask, (center_x, center_y), radius, one_class['color'], one_class['thickness'])\n",
    "\n",
    "                else:\n",
    "                    print('未知标注类型', one_class['type'])\n",
    "                    \n",
    "    return img_mask"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ab79224d-f751-4d33-bfc9-50be5e4e6a9e",
   "metadata": {},
   "source": [
    "## labelme转mask-批量"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "1d238f03-945a-4bc8-9c38-75de2c40188f",
   "metadata": {},
   "outputs": [],
   "source": [
    "os.chdir(Dataset_Path)\n",
    "os.mkdir('masks')\n",
    "os.chdir('images')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "8bc686bf-d781-4c43-827b-2526cf672031",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 87/87 [00:01<00:00, 45.33it/s]\n"
     ]
    }
   ],
   "source": [
    "for img_path in tqdm(os.listdir()):\n",
    "    \n",
    "    try:\n",
    "    \n",
    "        labelme_json_path = os.path.join('../', 'labelme_jsons', '.'.join(img_path.split('.')[:-1])+'.json')\n",
    "\n",
    "        img_mask = labelme2mask_single_img(img_path, labelme_json_path)\n",
    "\n",
    "        mask_path = img_path.split('.')[0] + '.png'\n",
    "\n",
    "        cv2.imwrite(os.path.join('../','masks',mask_path), img_mask)\n",
    "    \n",
    "    except Exception as E:\n",
    "        print(img_path, '转换失败', E)\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8a950a7c-1dd4-425e-84ee-b7bcdef69e0a",
   "metadata": {},
   "source": [
    "## 转换之后的mask保存在`masks`文件夹中"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9edf42ff-9f63-4c50-9dd4-64fea4e86345",
   "metadata": {},
   "source": [
    "## 重命名和删除文件夹"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "0f5169d8-c9f6-41a0-851a-1d88b2fbd1fe",
   "metadata": {},
   "outputs": [],
   "source": [
    "os.chdir('../')\n",
    "shutil.move('images', 'img_dir')\n",
    "shutil.move('masks', 'ann_dir')\n",
    "!rm -rf labelme_jsons\n",
    "os.chdir('../')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a7ee53d4-f466-488d-9c44-4933b10f4731",
   "metadata": {},
   "source": [
    "## 得到最终的语义分割数据集"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4a97b973-7cbd-4407-8aed-7bd54bed4c8f",
   "metadata": {},
   "source": [
    "## 查看数据集目录结构"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "6a712788-3796-474c-af61-cead5276b087",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "📁 Watermelon87_Semantic_Seg_Labelme/\n",
      "├─📁 img_dir/\n",
      "└─📁 ann_dir/\n"
     ]
    }
   ],
   "source": [
    "import seedir as sd\n",
    "sd.seedir('Watermelon87_Semantic_Seg_Labelme', style='emoji', depthlimit=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7fb9a63c-9019-4da6-babc-4c31c246793a",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
